In the following, I’ll illustrate how to load data and calculate jaccards

package ‘tidyverse’ was built under R version 3.6.2Registered S3 methods overwritten by 'dbplyr':
  method         from
  print.tbl_lazy     
  print.tbl_sql      
── Attaching packages ──────────────────────────────────────────────────────── tidyverse 1.3.1 ──
✓ ggplot2 3.3.5     ✓ purrr   0.3.4
✓ tibble  3.1.3     ✓ dplyr   1.0.7
✓ tidyr   1.1.3     ✓ stringr 1.4.0
✓ readr   1.4.0     ✓ forcats 0.5.1
package ‘ggplot2’ was built under R version 3.6.2package ‘tibble’ was built under R version 3.6.2package ‘tidyr’ was built under R version 3.6.2package ‘readr’ was built under R version 3.6.2package ‘purrr’ was built under R version 3.6.2package ‘dplyr’ was built under R version 3.6.2package ‘forcats’ was built under R version 3.6.2── Conflicts ─────────────────────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()

The following will of course be specific to your environment

data<-read_csv("./output/tweetjson006_annotated_tweets.csv")
Missing column names filled in: 'X1' [1]
── Column specification ─────────────────────────────────────────────────────────────────────────
cols(
  .default = col_double(),
  tweet_created_at = col_datetime(format = ""),
  tweet_text = col_character(),
  tweet_entities = col_character(),
  tweet_public_metrics = col_character(),
  tweet_referenced_tweets_id = col_character(),
  tweet_referenced_tweets_type = col_character(),
  date_floor = col_datetime(format = "")
)
ℹ Use `spec()` for the full column specifications.
data

Note that all topic assignments are “topic_X” comments

First, visualizing at different levels of aggregation. My strategy here is to select the data that I want to manipuate, then pivot, then aggregate. I’m going to write this code using functions to make things more modular, but you don’t have to.

Prepping the data

The strategy here is to summarize by day in each column, join with a date sequence, pivot, then fill NAs.


prep_data<-function(d) {
  d %>% mutate(day = as.Date(floor_date(tweet_created_at,unit="day"))) %>% group_by(day) %>% summarise_at(vars(starts_with("topic_")),sum)->d
  seq_dates<-tibble(day = seq.Date(from=min(d$day),to=max(d$day),by="day"))
  d <- d %>% right_join(seq_dates) %>% select(day, starts_with("topic_")) %>% pivot_longer(names_to = "topic", values_to = "weight", starts_with("topic_")) %>% replace_na(list(weight=0))
}

data.s <- prep_data(data)
Joining, by = "day"
data.s

Looks good!

Graphing the data

Once again, I’ll do this as a function. Note, I’m going to shuffle the colors around here to help me see the topic boundaries. The default discrete palette is “hue ordered”, making hard to see where the boundaries are.

library(scales)
library(colorspace)

plot_topics<-function(long_data) {
  num_topics = length(unique(long_data$topic))
  
  # Going to use a trick here to make sure I get distant colors next to one another
  cols <- hue_pal()(num_topics)
  half <- 1:ceiling(length(cols)/2)
  cols <-lighten(muted(as.vector(rbind(cols[half],cols[-half]))),.5)
  g<-ggplot(long_data)+geom_area(aes(x=day,y=weight,fill=topic)) + scale_fill_manual(values = cols)+guides(fill=guide_legend(ncol=2))
  return(g)
}
plot_topics(data.s)
number of columns of result is not a multiple of vector length (arg 2)

Ok, so that sort of sucks, so we’ll do a little aggregation. I’m going to add a function here to bin the data. Also adding a normalization parameter if I want to look at proportions.

bin_data<-function(long_data,num_days,normalize = F) {
  d<- long_data%>% ungroup() %>% mutate(index = floor(as.numeric(day - min(day)) / num_days)) %>% group_by(index,topic) %>% summarise(weight = sum(weight),day = min(day)) %>% ungroup() %>% select(-index)
   if (normalize) {
    d %>% group_by(day) %>% mutate(weight = weight / sum(weight))-> d
  }
  return(d)
}

binned_data <- bin_data(data.s,7)
`summarise()` has grouped output by 'index'. You can override using the `.groups` argument.
plot_topics(binned_data)+ggtitle("Binning by 7 days")
number of columns of result is not a multiple of vector length (arg 2)

Ok, this is visually a bit jarring, but I can begin to see the individual topics. Let’s look at a few more.

binned_data <- bin_data(data.s,15)
`summarise()` has grouped output by 'index'. You can override using the `.groups` argument.
plot_topics(binned_data)+ggtitle("Binning by 15 days")
number of columns of result is not a multiple of vector length (arg 2)

binned_data <- bin_data(data.s,15,T)
`summarise()` has grouped output by 'index'. You can override using the `.groups` argument.
plot_topics(binned_data)+ggtitle("Binning by 15 days,normalized")
number of columns of result is not a multiple of vector length (arg 2)

I notice some interesting variance in topic 5 and topic 3 in the early part of 2020, but otherwise, nothing tremendously useful. Might be nice to label the topics right on the graph, but we can do that later. See this stack over flow post.

Also, it occurs to me that I could smooth this out quite a bit by rolling a window over the data. I’m going to use RCppRoll, and I’ll use mean values rather than sums

roll_data<-function(long_data,win_size = 5, by = 1, normalize = F) {
  # To make life easier, I'm going to pivot my long data to wide
  wd<-pivot_wider(long_data,names_from = topic,values_from = weight) %>% arrange(day)
  rolled<-as_tibble(apply(wd %>% select(starts_with("topic_")),2,function(x) roll_mean(x,n = win_size,by = by)))

  win_ends <- roll_max(1:nrow(wd),n=win_size,by=by)

  rolled$day = wd$day[win_ends]
  r<-rolled %>% select(day,everything()) %>% pivot_longer(names_to = "topic", values_to = "weight", starts_with("topic_"))
  if (normalize) {
    r %>% group_by(day) %>% mutate(weight = weight / sum(weight))-> r
  }
  return(r)
  
}

roll_data(data.s,7,1)

Looks ok. Let’s try it out. Expect to see much smoother graph.

rolled_data <- roll_data(data.s,15,1)
plot_topics(rolled_data)+ggtitle("Rolling by 15 days")
number of columns of result is not a multiple of vector length (arg 2)

Double checking - if we advance by 15 days at a time, this should look very similar to the binned data

rolled_data <- roll_data(data.s,15,15)
plot_topics(rolled_data)+ggtitle("Rolling by 15 days")
number of columns of result is not a multiple of vector length (arg 2)

Great, finally, with normalization

rolled_data <- roll_data(data.s,15,1, T)
plot_topics(rolled_data)+ggtitle("Rolling by 15 days, delta = 1, normalized")
number of columns of result is not a multiple of vector length (arg 2)

Calculate Weighted Jaccards

Using the above, we’ll create a weighted jaccards function

weighted_jaccard<-function(x,y) {
  n<-sum(pmin(x,y))
  d<-sum(pmax(x,y))
  ifelse(d==0,0,n/d)
}

# Presume our data has already been binned / rolled
calc_topic_churn<-function(long_data) {
  long_data %>% group_by(topic) %>% arrange(day,.by_group = TRUE) %>% mutate(lagged_weights = lag(weight,1,order_by = day)) -> lagged_data
  #return(lagged_data)
  lagged_data %>% filter(!is.na(lagged_weights)) %>% group_by(day) %>% summarise(jaccard = weighted_jaccard(weight,lagged_weights))
}

calc_topic_churn(data.s)

Looks good, so checking plotting


ggplot(calc_topic_churn(data.s))+geom_line(aes(day,jaccard))

Now with binning


rolled_data<-roll_data(data.s,7,by=7)
ggplot(calc_topic_churn(rolled_data))+geom_line(aes(day,jaccard))+theme_minimal()+ylim(0,1)

Cosine similarity

We can do the same thing with cosine similarity.

cosine_similarity<-function(x,y) {
  if (length(x) != length(y)) {
    stop("x and y must be equal length vectors")
  }
  n = sum(x*y)
  d = sqrt(sum(x^2))*sqrt(sum(y^2))
  ifelse(d==0,0,n/d)
}

# Presume our data has already been binned / rolled
calc_cosine_similarity<-function(long_data) {
  long_data %>% group_by(topic) %>% arrange(day,.by_group = TRUE) %>% mutate(lagged_weights = lag(weight,1,order_by = day)) -> lagged_data
  #return(lagged_data)
  lagged_data %>% filter(!is.na(lagged_weights)) %>% group_by(day) %>% summarise(similarity = cosine_similarity(weight,lagged_weights))
}

calc_cosine_similarity(data.s)

Looks good. Plotting as before, comparing the two.


ggplot(calc_cosine_similarity(data.s))+geom_line(aes(day,similarity))+theme_minimal()+ylim(0,1)+ggtitle("Cosine similarity")

ggplot(calc_topic_churn(data.s))+geom_line(aes(day,jaccard))+theme_minimal()+ylim(0,1)+ggtitle("Jaccard")


rolled_data<-roll_data(data.s,7,7)

ggplot(calc_cosine_similarity(rolled_data))+geom_line(aes(day,similarity))+theme_minimal()+ylim(0,1)+ggtitle("Cosine similarity")

ggplot(calc_topic_churn(rolled_data))+geom_line(aes(day,jaccard))+theme_minimal()+ylim(0,1)+ggtitle("Jaccard")

NA
NA

Looking at entropy

One last potential measure here - we’ll have a look at entropy. Note that entropy is calculated within a window, rather than by comparing two windows. Also, entropy is not normalized.

entropy<-function(x,base=exp(1)) {
  p = x/sum(x)
  -sum(p*log(p,base))  
}

# Presume our data has already been binned / rolled
calc_entropy<-function(long_data) {
  long_data %>% group_by(day) %>% summarise(entropy = entropy(weight))
}

calc_entropy(data.s)
ggplot(calc_entropy(data.s))+geom_line(aes(day,entropy))+theme_minimal()+ggtitle("Entropy")

I find this a little unintuitive though, so using the definition of skew from Introne & Goggins (2015)

skew<-function(x) {
  if (length(x)==0) {
    return(0)
  } else {
    p = x/sum(x)
    1 - exp(-sum(p*log(p)))/length(x)
  }
}

# Presume our data has already been binned / rolled
calc_skew<-function(long_data) {
  long_data %>% group_by(day) %>% summarise(skew = skew(weight))
}

calc_skew(data.s)
ggplot(calc_skew(data.s))+geom_line(aes(day,skew))+ylim(0,1)+theme_minimal()+ggtitle("Skew")

Great. This indicates that there’s a pretty even balance here across the topics over time.

rolled_data<-roll_data(data.s,7,1)

ggplot(calc_skew(rolled_data))+geom_line(aes(day,skew))+ylim(0,1)+theme_minimal()+ggtitle("Skew")

LS0tCnRpdGxlOiAiTG9hZCBhbmQgVmlzdWFsaXplICIKb3V0cHV0OiBodG1sX25vdGVib29rCi0tLQoKSW4gdGhlIGZvbGxvd2luZywgSSdsbCBpbGx1c3RyYXRlIGhvdyB0byBsb2FkIGRhdGEgYW5kIGNhbGN1bGF0ZSBqYWNjYXJkcwoKYGBge3IgZWNobz1GYWxzZX0KbGlicmFyeSh0aWR5dmVyc2UpCgpgYGAKClRoZSBmb2xsb3dpbmcgd2lsbCBvZiBjb3Vyc2UgYmUgc3BlY2lmaWMgdG8geW91ciBlbnZpcm9ubWVudAoKYGBge3J9CmRhdGE8LXJlYWRfY3N2KCIuL291dHB1dC90d2VldGpzb24wMDZfYW5ub3RhdGVkX3R3ZWV0cy5jc3YiKQpkYXRhCmBgYAoKTm90ZSB0aGF0IGFsbCB0b3BpYyBhc3NpZ25tZW50cyBhcmUgInRvcGljX1giIGNvbW1lbnRzCgpGaXJzdCwgdmlzdWFsaXppbmcgYXQgZGlmZmVyZW50IGxldmVscyBvZiBhZ2dyZWdhdGlvbi4gIE15IHN0cmF0ZWd5IGhlcmUgaXMgdG8gc2VsZWN0IHRoZSBkYXRhIHRoYXQgSSB3YW50IHRvIG1hbmlwdWF0ZSwgdGhlbiBwaXZvdCwgdGhlbiBhZ2dyZWdhdGUuICBJJ20gZ29pbmcgdG8gd3JpdGUgdGhpcyBjb2RlIHVzaW5nIGZ1bmN0aW9ucyB0byBtYWtlIHRoaW5ncyBtb3JlIG1vZHVsYXIsIGJ1dCB5b3UgZG9uJ3QgaGF2ZSB0by4KCiMjIyBQcmVwcGluZyB0aGUgZGF0YQoKVGhlIHN0cmF0ZWd5IGhlcmUgaXMgdG8gc3VtbWFyaXplIGJ5IGRheSBpbiBlYWNoIGNvbHVtbiwgam9pbiB3aXRoIGEgZGF0ZSBzZXF1ZW5jZSwgcGl2b3QsIHRoZW4gZmlsbCBOQXMuCgpgYGB7cn0KCnByZXBfZGF0YTwtZnVuY3Rpb24oZCkgewogIGQgJT4lIG11dGF0ZShkYXkgPSBhcy5EYXRlKGZsb29yX2RhdGUodHdlZXRfY3JlYXRlZF9hdCx1bml0PSJkYXkiKSkpICU+JSBncm91cF9ieShkYXkpICU+JSBzdW1tYXJpc2VfYXQodmFycyhzdGFydHNfd2l0aCgidG9waWNfIikpLHN1bSktPmQKICBzZXFfZGF0ZXM8LXRpYmJsZShkYXkgPSBzZXEuRGF0ZShmcm9tPW1pbihkJGRheSksdG89bWF4KGQkZGF5KSxieT0iZGF5IikpCiAgZCA8LSBkICU+JSByaWdodF9qb2luKHNlcV9kYXRlcykgJT4lIHNlbGVjdChkYXksIHN0YXJ0c193aXRoKCJ0b3BpY18iKSkgJT4lIHBpdm90X2xvbmdlcihuYW1lc190byA9ICJ0b3BpYyIsIHZhbHVlc190byA9ICJ3ZWlnaHQiLCBzdGFydHNfd2l0aCgidG9waWNfIikpICU+JSByZXBsYWNlX25hKGxpc3Qod2VpZ2h0PTApKQp9CgpkYXRhLnMgPC0gcHJlcF9kYXRhKGRhdGEpCmRhdGEucwpgYGAKCkxvb2tzIGdvb2QhCgojIyMgR3JhcGhpbmcgdGhlIGRhdGEKCk9uY2UgYWdhaW4sIEknbGwgZG8gdGhpcyBhcyBhIGZ1bmN0aW9uLiBOb3RlLCBJJ20gZ29pbmcgdG8gc2h1ZmZsZSB0aGUgY29sb3JzIGFyb3VuZCBoZXJlIHRvIGhlbHAgbWUgc2VlIHRoZSB0b3BpYyBib3VuZGFyaWVzLiAgVGhlIGRlZmF1bHQgZGlzY3JldGUgcGFsZXR0ZSBpcyAiaHVlIG9yZGVyZWQiLCBtYWtpbmcgaGFyZCB0byBzZWUgd2hlcmUgdGhlIGJvdW5kYXJpZXMgYXJlLgoKYGBge3IgZmlnLndpZHRoPTE1LGZpZy5oZWlnaHQ9NX0KbGlicmFyeShzY2FsZXMpCmxpYnJhcnkoY29sb3JzcGFjZSkKCnBsb3RfdG9waWNzPC1mdW5jdGlvbihsb25nX2RhdGEpIHsKICBudW1fdG9waWNzID0gbGVuZ3RoKHVuaXF1ZShsb25nX2RhdGEkdG9waWMpKQogIAogICMgR29pbmcgdG8gdXNlIGEgdHJpY2sgaGVyZSB0byBtYWtlIHN1cmUgSSBnZXQgZGlzdGFudCBjb2xvcnMgbmV4dCB0byBvbmUgYW5vdGhlcgogIGNvbHMgPC0gaHVlX3BhbCgpKG51bV90b3BpY3MpCiAgaGFsZiA8LSAxOmNlaWxpbmcobGVuZ3RoKGNvbHMpLzIpCiAgY29scyA8LWxpZ2h0ZW4obXV0ZWQoYXMudmVjdG9yKHJiaW5kKGNvbHNbaGFsZl0sY29sc1staGFsZl0pKSksLjUpCiAgZzwtZ2dwbG90KGxvbmdfZGF0YSkrZ2VvbV9hcmVhKGFlcyh4PWRheSx5PXdlaWdodCxmaWxsPXRvcGljKSkgKyBzY2FsZV9maWxsX21hbnVhbCh2YWx1ZXMgPSBjb2xzKStndWlkZXMoZmlsbD1ndWlkZV9sZWdlbmQobmNvbD0yKSkKICByZXR1cm4oZykKfQpwbG90X3RvcGljcyhkYXRhLnMpCmBgYAoKT2ssIHNvIHRoYXQgc29ydCBvZiBzdWNrcywgc28gd2UnbGwgZG8gYSBsaXR0bGUgYWdncmVnYXRpb24uICBJJ20gZ29pbmcgdG8gYWRkIGEgZnVuY3Rpb24gaGVyZSB0byBiaW4gdGhlIGRhdGEuICBBbHNvIGFkZGluZyBhIG5vcm1hbGl6YXRpb24gcGFyYW1ldGVyIGlmIEkgd2FudCB0byBsb29rIGF0IHByb3BvcnRpb25zLgoKYGBge3IgZmlnLndpZHRoPTE1LCBmaWcuaGVpZ2h0PTV9CmJpbl9kYXRhPC1mdW5jdGlvbihsb25nX2RhdGEsbnVtX2RheXMsbm9ybWFsaXplID0gRikgewogIGQ8LSBsb25nX2RhdGElPiUgdW5ncm91cCgpICU+JSBtdXRhdGUoaW5kZXggPSBmbG9vcihhcy5udW1lcmljKGRheSAtIG1pbihkYXkpKSAvIG51bV9kYXlzKSkgJT4lIGdyb3VwX2J5KGluZGV4LHRvcGljKSAlPiUgc3VtbWFyaXNlKHdlaWdodCA9IHN1bSh3ZWlnaHQpLGRheSA9IG1pbihkYXkpKSAlPiUgdW5ncm91cCgpICU+JSBzZWxlY3QoLWluZGV4KQogICBpZiAobm9ybWFsaXplKSB7CiAgICBkICU+JSBncm91cF9ieShkYXkpICU+JSBtdXRhdGUod2VpZ2h0ID0gd2VpZ2h0IC8gc3VtKHdlaWdodCkpLT4gZAogIH0KICByZXR1cm4oZCkKfQoKYmlubmVkX2RhdGEgPC0gYmluX2RhdGEoZGF0YS5zLDcpCnBsb3RfdG9waWNzKGJpbm5lZF9kYXRhKStnZ3RpdGxlKCJCaW5uaW5nIGJ5IDcgZGF5cyIpCmBgYApPaywgdGhpcyBpcyB2aXN1YWxseSBhIGJpdCBqYXJyaW5nLCBidXQgSSBjYW4gYmVnaW4gdG8gc2VlIHRoZSBpbmRpdmlkdWFsIHRvcGljcy4gIExldCdzIGxvb2sgYXQgYSBmZXcgbW9yZS4KCmBgYHtyIGZpZy53aWR0aD0xNSwgZmlnLmhlaWdodD01fQpiaW5uZWRfZGF0YSA8LSBiaW5fZGF0YShkYXRhLnMsMTUpCnBsb3RfdG9waWNzKGJpbm5lZF9kYXRhKStnZ3RpdGxlKCJCaW5uaW5nIGJ5IDE1IGRheXMiKQpgYGAKCmBgYHtyIGZpZy53aWR0aD0xNSwgZmlnLmhlaWdodD01fQpiaW5uZWRfZGF0YSA8LSBiaW5fZGF0YShkYXRhLnMsMTUsVCkKcGxvdF90b3BpY3MoYmlubmVkX2RhdGEpK2dndGl0bGUoIkJpbm5pbmcgYnkgMTUgZGF5cyxub3JtYWxpemVkIikKYGBgCkkgbm90aWNlIHNvbWUgaW50ZXJlc3RpbmcgdmFyaWFuY2UgaW4gdG9waWMgNSBhbmQgdG9waWMgMyBpbiB0aGUgZWFybHkgcGFydCBvZiAyMDIwLCBidXQgb3RoZXJ3aXNlLCBub3RoaW5nIHRyZW1lbmRvdXNseSB1c2VmdWwuICBNaWdodCBiZSBuaWNlIHRvIGxhYmVsIHRoZSB0b3BpY3MgcmlnaHQgb24gdGhlIGdyYXBoLCBidXQgd2UgY2FuIGRvIHRoYXQgbGF0ZXIuIFNlZSBbdGhpcyBzdGFjayBvdmVyIGZsb3cgcG9zdF0oaHR0cHM6Ly9zdGFja292ZXJmbG93LmNvbS9xdWVzdGlvbnMvMTAzOTM5NTYvYWRkLWRpcmVjdC1sYWJlbHMtdG8tZ2dwbG90Mi1nZW9tLWFyZWEtY2hhcnQpLgoKQWxzbywgaXQgb2NjdXJzIHRvIG1lIHRoYXQgSSBjb3VsZCBzbW9vdGggdGhpcyBvdXQgcXVpdGUgYSBiaXQgYnkgcm9sbGluZyBhIHdpbmRvdyBvdmVyIHRoZSBkYXRhLiAgSSdtIGdvaW5nIHRvIHVzZSBSQ3BwUm9sbCwgYW5kIEknbGwgdXNlIG1lYW4gdmFsdWVzIHJhdGhlciB0aGFuIHN1bXMKCmBgYHtyfQpyb2xsX2RhdGE8LWZ1bmN0aW9uKGxvbmdfZGF0YSx3aW5fc2l6ZSA9IDUsIGJ5ID0gMSwgbm9ybWFsaXplID0gRikgewogICMgVG8gbWFrZSBsaWZlIGVhc2llciwgSSdtIGdvaW5nIHRvIHBpdm90IG15IGxvbmcgZGF0YSB0byB3aWRlCiAgd2Q8LXBpdm90X3dpZGVyKGxvbmdfZGF0YSxuYW1lc19mcm9tID0gdG9waWMsdmFsdWVzX2Zyb20gPSB3ZWlnaHQpICU+JSBhcnJhbmdlKGRheSkKICByb2xsZWQ8LWFzX3RpYmJsZShhcHBseSh3ZCAlPiUgc2VsZWN0KHN0YXJ0c193aXRoKCJ0b3BpY18iKSksMixmdW5jdGlvbih4KSByb2xsX21lYW4oeCxuID0gd2luX3NpemUsYnkgPSBieSkpKQoKICB3aW5fZW5kcyA8LSByb2xsX21heCgxOm5yb3cod2QpLG49d2luX3NpemUsYnk9YnkpCgogIHJvbGxlZCRkYXkgPSB3ZCRkYXlbd2luX2VuZHNdCiAgcjwtcm9sbGVkICU+JSBzZWxlY3QoZGF5LGV2ZXJ5dGhpbmcoKSkgJT4lIHBpdm90X2xvbmdlcihuYW1lc190byA9ICJ0b3BpYyIsIHZhbHVlc190byA9ICJ3ZWlnaHQiLCBzdGFydHNfd2l0aCgidG9waWNfIikpCiAgaWYgKG5vcm1hbGl6ZSkgewogICAgciAlPiUgZ3JvdXBfYnkoZGF5KSAlPiUgbXV0YXRlKHdlaWdodCA9IHdlaWdodCAvIHN1bSh3ZWlnaHQpKS0+IHIKICB9CiAgcmV0dXJuKHIpCiAgCn0KCnJvbGxfZGF0YShkYXRhLnMsNywxKQpgYGAKTG9va3Mgb2suICBMZXQncyB0cnkgaXQgb3V0LiAgRXhwZWN0IHRvIHNlZSBtdWNoIHNtb290aGVyIGdyYXBoLgoKYGBge3IgZmlnLndpZHRoPTE1LCBmaWcuaGVpZ2h0PTV9CnJvbGxlZF9kYXRhIDwtIHJvbGxfZGF0YShkYXRhLnMsMTUsMSkKcGxvdF90b3BpY3Mocm9sbGVkX2RhdGEpK2dndGl0bGUoIlJvbGxpbmcgYnkgMTUgZGF5cyIpCmBgYApEb3VibGUgY2hlY2tpbmcgLSBpZiB3ZSBhZHZhbmNlIGJ5IDE1IGRheXMgYXQgYSB0aW1lLCB0aGlzIHNob3VsZCBsb29rIHZlcnkgc2ltaWxhciB0byB0aGUgYmlubmVkIGRhdGEKCmBgYHtyIGZpZy53aWR0aD0xNSwgZmlnLmhlaWdodD01fQpyb2xsZWRfZGF0YSA8LSByb2xsX2RhdGEoZGF0YS5zLDE1LDE1KQpwbG90X3RvcGljcyhyb2xsZWRfZGF0YSkrZ2d0aXRsZSgiUm9sbGluZyBieSAxNSBkYXlzLCBkZWx0YSA9IDE1IikKYGBgCgpHcmVhdCwgZmluYWxseSwgd2l0aCBub3JtYWxpemF0aW9uCgpgYGB7ciBmaWcud2lkdGg9MTUsIGZpZy5oZWlnaHQ9NX0Kcm9sbGVkX2RhdGEgPC0gcm9sbF9kYXRhKGRhdGEucywxNSwxLCBUKQpwbG90X3RvcGljcyhyb2xsZWRfZGF0YSkrZ2d0aXRsZSgiUm9sbGluZyBieSAxNSBkYXlzLCBkZWx0YSA9IDEsIG5vcm1hbGl6ZWQiKQpgYGAKCiMgQ2FsY3VsYXRlIFdlaWdodGVkIEphY2NhcmRzCgpVc2luZyB0aGUgYWJvdmUsIHdlJ2xsIGNyZWF0ZSBhIHdlaWdodGVkIGphY2NhcmRzIGZ1bmN0aW9uCgpgYGB7cn0Kd2VpZ2h0ZWRfamFjY2FyZDwtZnVuY3Rpb24oeCx5KSB7CiAgbjwtc3VtKHBtaW4oeCx5KSkKICBkPC1zdW0ocG1heCh4LHkpKQogIGlmZWxzZShkPT0wLDAsbi9kKQp9CgojIFByZXN1bWUgb3VyIGRhdGEgaGFzIGFscmVhZHkgYmVlbiBiaW5uZWQgLyByb2xsZWQKY2FsY190b3BpY19jaHVybjwtZnVuY3Rpb24obG9uZ19kYXRhKSB7CiAgbG9uZ19kYXRhICU+JSBncm91cF9ieSh0b3BpYykgJT4lIGFycmFuZ2UoZGF5LC5ieV9ncm91cCA9IFRSVUUpICU+JSBtdXRhdGUobGFnZ2VkX3dlaWdodHMgPSBsYWcod2VpZ2h0LDEsb3JkZXJfYnkgPSBkYXkpKSAtPiBsYWdnZWRfZGF0YQogICNyZXR1cm4obGFnZ2VkX2RhdGEpCiAgbGFnZ2VkX2RhdGEgJT4lIGZpbHRlcighaXMubmEobGFnZ2VkX3dlaWdodHMpKSAlPiUgZ3JvdXBfYnkoZGF5KSAlPiUgc3VtbWFyaXNlKGphY2NhcmQgPSB3ZWlnaHRlZF9qYWNjYXJkKHdlaWdodCxsYWdnZWRfd2VpZ2h0cykpCn0KCmNhbGNfdG9waWNfY2h1cm4oZGF0YS5zKQpgYGAKTG9va3MgZ29vZCwgc28gY2hlY2tpbmcgcGxvdHRpbmcKYGBge3IgZmlnLndpZHRoPTE1LCBmaWcuaGVpZ2h0PTV9CgpnZ3Bsb3QoY2FsY190b3BpY19jaHVybihkYXRhLnMpKStnZW9tX2xpbmUoYWVzKGRheSxqYWNjYXJkKSkKCmBgYApOb3cgd2l0aCBiaW5uaW5nCmBgYHtyIGZpZy53aWR0aD0xNSwgZmlnLmhlaWdodD01fQoKcm9sbGVkX2RhdGE8LXJvbGxfZGF0YShkYXRhLnMsNyxieT03KQpnZ3Bsb3QoY2FsY190b3BpY19jaHVybihyb2xsZWRfZGF0YSkpK2dlb21fbGluZShhZXMoZGF5LGphY2NhcmQpKSt0aGVtZV9taW5pbWFsKCkreWxpbSgwLDEpCgpgYGAKIyMgQ29zaW5lIHNpbWlsYXJpdHkKCldlIGNhbiBkbyB0aGUgc2FtZSB0aGluZyB3aXRoIGNvc2luZSBzaW1pbGFyaXR5LgoKYGBge3J9CmNvc2luZV9zaW1pbGFyaXR5PC1mdW5jdGlvbih4LHkpIHsKICBpZiAobGVuZ3RoKHgpICE9IGxlbmd0aCh5KSkgewogICAgc3RvcCgieCBhbmQgeSBtdXN0IGJlIGVxdWFsIGxlbmd0aCB2ZWN0b3JzIikKICB9CiAgbiA9IHN1bSh4KnkpCiAgZCA9IHNxcnQoc3VtKHheMikpKnNxcnQoc3VtKHleMikpCiAgaWZlbHNlKGQ9PTAsMCxuL2QpCn0KCiMgUHJlc3VtZSBvdXIgZGF0YSBoYXMgYWxyZWFkeSBiZWVuIGJpbm5lZCAvIHJvbGxlZApjYWxjX2Nvc2luZV9zaW1pbGFyaXR5PC1mdW5jdGlvbihsb25nX2RhdGEpIHsKICBsb25nX2RhdGEgJT4lIGdyb3VwX2J5KHRvcGljKSAlPiUgYXJyYW5nZShkYXksLmJ5X2dyb3VwID0gVFJVRSkgJT4lIG11dGF0ZShsYWdnZWRfd2VpZ2h0cyA9IGxhZyh3ZWlnaHQsMSxvcmRlcl9ieSA9IGRheSkpIC0+IGxhZ2dlZF9kYXRhCiAgI3JldHVybihsYWdnZWRfZGF0YSkKICBsYWdnZWRfZGF0YSAlPiUgZmlsdGVyKCFpcy5uYShsYWdnZWRfd2VpZ2h0cykpICU+JSBncm91cF9ieShkYXkpICU+JSBzdW1tYXJpc2Uoc2ltaWxhcml0eSA9IGNvc2luZV9zaW1pbGFyaXR5KHdlaWdodCxsYWdnZWRfd2VpZ2h0cykpCn0KCmNhbGNfY29zaW5lX3NpbWlsYXJpdHkoZGF0YS5zKQpgYGAKCkxvb2tzIGdvb2QuIFBsb3R0aW5nIGFzIGJlZm9yZSwgY29tcGFyaW5nIHRoZSB0d28uCgpgYGB7ciBmaWcud2lkdGg9MTUsIGZpZy5oZWlnaHQ9NX0KCmdncGxvdChjYWxjX2Nvc2luZV9zaW1pbGFyaXR5KGRhdGEucykpK2dlb21fbGluZShhZXMoZGF5LHNpbWlsYXJpdHkpKSt0aGVtZV9taW5pbWFsKCkreWxpbSgwLDEpK2dndGl0bGUoIkNvc2luZSBzaW1pbGFyaXR5IikKZ2dwbG90KGNhbGNfdG9waWNfY2h1cm4oZGF0YS5zKSkrZ2VvbV9saW5lKGFlcyhkYXksamFjY2FyZCkpK3RoZW1lX21pbmltYWwoKSt5bGltKDAsMSkrZ2d0aXRsZSgiSmFjY2FyZCIpCgpgYGAKYGBge3IgZmlnLndpZHRoPTE1LCBmaWcuaGVpZ2h0PTV9Cgpyb2xsZWRfZGF0YTwtcm9sbF9kYXRhKGRhdGEucyw3LDcpCgpnZ3Bsb3QoY2FsY19jb3NpbmVfc2ltaWxhcml0eShyb2xsZWRfZGF0YSkpK2dlb21fbGluZShhZXMoZGF5LHNpbWlsYXJpdHkpKSt0aGVtZV9taW5pbWFsKCkreWxpbSgwLDEpK2dndGl0bGUoIkNvc2luZSBzaW1pbGFyaXR5IikKZ2dwbG90KGNhbGNfdG9waWNfY2h1cm4ocm9sbGVkX2RhdGEpKStnZW9tX2xpbmUoYWVzKGRheSxqYWNjYXJkKSkrdGhlbWVfbWluaW1hbCgpK3lsaW0oMCwxKStnZ3RpdGxlKCJKYWNjYXJkIikKCgpgYGAKIyMgTG9va2luZyBhdCBlbnRyb3B5CgpPbmUgbGFzdCBwb3RlbnRpYWwgbWVhc3VyZSBoZXJlIC0gd2UnbGwgaGF2ZSBhIGxvb2sgYXQgZW50cm9weS4gIE5vdGUgdGhhdCBlbnRyb3B5IGlzIGNhbGN1bGF0ZWQgd2l0aGluIGEgd2luZG93LCByYXRoZXIgdGhhbiBieSBjb21wYXJpbmcgdHdvIHdpbmRvd3MuICBBbHNvLCBlbnRyb3B5IGlzIG5vdCBub3JtYWxpemVkLgoKYGBge3J9CmVudHJvcHk8LWZ1bmN0aW9uKHgsYmFzZT1leHAoMSkpIHsKICBwID0geC9zdW0oeCkKICAtc3VtKHAqbG9nKHAsYmFzZSkpICAKfQoKIyBQcmVzdW1lIG91ciBkYXRhIGhhcyBhbHJlYWR5IGJlZW4gYmlubmVkIC8gcm9sbGVkCmNhbGNfZW50cm9weTwtZnVuY3Rpb24obG9uZ19kYXRhKSB7CiAgbG9uZ19kYXRhICU+JSBncm91cF9ieShkYXkpICU+JSBzdW1tYXJpc2UoZW50cm9weSA9IGVudHJvcHkod2VpZ2h0KSkKfQoKY2FsY19lbnRyb3B5KGRhdGEucykKYGBgCmBgYHtyIGZpZy53aWR0aD0xNSwgZmlnLmhlaWdodD01fQpnZ3Bsb3QoY2FsY19lbnRyb3B5KGRhdGEucykpK2dlb21fbGluZShhZXMoZGF5LGVudHJvcHkpKSt0aGVtZV9taW5pbWFsKCkrZ2d0aXRsZSgiRW50cm9weSIpCmBgYApJIGZpbmQgdGhpcyBhIGxpdHRsZSB1bmludHVpdGl2ZSB0aG91Z2gsIHNvIHVzaW5nIHRoZSBkZWZpbml0aW9uIG9mIHNrZXcgZnJvbSBJbnRyb25lICYgR29nZ2lucyAoMjAxNSkKCmBgYHtyfQpza2V3PC1mdW5jdGlvbih4KSB7CiAgaWYgKGxlbmd0aCh4KT09MCkgewogICAgcmV0dXJuKDApCiAgfSBlbHNlIHsKICAgIHAgPSB4L3N1bSh4KQogICAgMSAtIGV4cCgtc3VtKHAqbG9nKHApKSkvbGVuZ3RoKHgpCiAgfQp9CgojIFByZXN1bWUgb3VyIGRhdGEgaGFzIGFscmVhZHkgYmVlbiBiaW5uZWQgLyByb2xsZWQKY2FsY19za2V3PC1mdW5jdGlvbihsb25nX2RhdGEpIHsKICBsb25nX2RhdGEgJT4lIGdyb3VwX2J5KGRheSkgJT4lIHN1bW1hcmlzZShza2V3ID0gc2tldyh3ZWlnaHQpKQp9CgpjYWxjX3NrZXcoZGF0YS5zKQpgYGAKCmBgYHtyIGZpZy53aWR0aD0xNSwgZmlnLmhlaWdodD01fQpnZ3Bsb3QoY2FsY19za2V3KGRhdGEucykpK2dlb21fbGluZShhZXMoZGF5LHNrZXcpKSt5bGltKDAsMSkrdGhlbWVfbWluaW1hbCgpK2dndGl0bGUoIlNrZXciKQpgYGAKR3JlYXQuICBUaGlzIGluZGljYXRlcyB0aGF0IHRoZXJlJ3MgYSBwcmV0dHkgZXZlbiBiYWxhbmNlIGhlcmUgYWNyb3NzIHRoZSB0b3BpY3Mgb3ZlciB0aW1lLgoKYGBge3IgZmlnLndpZHRoPTE1LCBmaWcuaGVpZ2h0PTV9CnJvbGxlZF9kYXRhPC1yb2xsX2RhdGEoZGF0YS5zLDcsMSkKCmdncGxvdChjYWxjX3NrZXcocm9sbGVkX2RhdGEpKStnZW9tX2xpbmUoYWVzKGRheSxza2V3KSkreWxpbSgwLDEpK3RoZW1lX21pbmltYWwoKStnZ3RpdGxlKCJTa2V3IikKYGBgCg==